TF1.x vs TF2.0
TensorFlow is the pioneering library for building deep learning models, launched in November 2015. It is free, open source, and was originally developed by Google.
Other libraries:
- PyTorch – from Facebook, October 2016
- TensorFlow 2.0
- Major new version, September 2019
- Dynamic computation graphs
- Not backward compatible with TF1
- Closer to PyTorch
TF1.x vs TF2.0 vs PyTorch
| TF1.x | PyTorch |
|---|---|
| Computation graph is static | Computation graph is dynamic |
| tf.Session for separation from Python | Tightly integrated with Python |
| Debugging via tfdbg | Debugging with PyCharm, pdb |
| Visualization with TensorBoard | Visualization using matplotlib, seaborn |
| tf.device and tf.DeviceSpec to use GPUs | torch.nn.DataParallel to use GPUs |
| TF1.x | TF2.x |
|---|---|
| Computation graph is static | Both dynamic and static graphs supported |
| Heavyweight build-then-run cycle, overkill for simple apps | Eager execution for development, lazy (graph) execution for deployment |
| Low-level APIs with multiple high-level APIs | Tightly integrated with Keras as the high-level API |
| tf.Session for hard separation from Python | No sessions, just functions; tf.function for advanced features |
Keras
A central, tightly integrated part of TensorFlow 2.0, covering every part of the machine learning workflow; the tf.keras implementation ships with and is used through TensorFlow 2.0.
- TensorFlow 2.0 includes the Keras API
- High-level API in tf.keras
- First-class support for estimators, pipelines, and eager execution
- tf.keras
- build, train, and evaluate models
- save/restore models (see the sketch after this list)
- leverage GPUs
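A minimal sketch of saving and restoring a tf.keras model; the layer sizes and the file name `my_model.h5` are arbitrary examples:

```python
import tensorflow as tf

# Hypothetical tiny model, just to illustrate save/restore with tf.keras.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(8, activation='relu', input_shape=(4,)),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer='adam', loss='mse')

# Save the architecture, weights, and optimizer state to disk,
# then restore the model from the saved file.
model.save('my_model.h5')
restored = tf.keras.models.load_model('my_model.h5')
```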
Eager Execution
Single biggest change in TF2.0: operations execute immediately, so you no longer have to build and run a computation graph by hand.
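A short sketch of eager execution in TF2.0; the matrices are made-up values:

```python
import tensorflow as tf

# In TF2.0, eager execution is on by default: this multiplication runs
# immediately and the result can be inspected like a normal Python value,
# with no tf.Session or explicit graph construction.
a = tf.constant([[1.0, 2.0], [3.0, 4.0]])
b = tf.constant([[1.0, 1.0], [0.0, 1.0]])
print(tf.matmul(a, b))        # tf.Tensor with the product, computed eagerly
print(tf.executing_eagerly()) # True
```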
Neural Network Overview
An ML-based classifier requires:
- Training – feed in large amounts of correctly labeled data
- Prediction – classify new instances not seen before
- Feedback – a loss function
- The classification output is fed back into the ML-based classifier
- This feedback helps improve the model
Neural Network
A deep learning algorithm learns which features matter. The neural network is the most common class of deep learning algorithm. Its fundamental building block is the neuron, and a network is composed of many layers of many neurons. Each layer learns something different from the data.
A single neuron is a mathematical function that takes several input values and generates a single output value, which can be fed into many neurons in the next layer. You can imagine the network as a chain of mathematical functions, each one different.
Each connection between neurons carries a weight W. A larger W means a stronger connection, and the strongest connections contribute the most to the next layer's output.
Each neuron applies only two simple functions to its inputs (see the sketch after this list):
- affine transformation = learns linear relationships between inputs and output; it takes the weights, inputs, and biases
- Wx + b
- activation function = helps discover non-linear relationships
- e.g. max(0, Wx + b) for ReLU
- These are non-linear functions
- ReLU = Rectified Linear Unit
- logit
- tanh
- step
- sigmoid
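A minimal sketch of a single neuron in NumPy; the inputs, weights, and bias are made-up values, and ReLU is used as the activation:

```python
import numpy as np

def neuron(x, W, b):
    # Affine transformation: learns linear relationships (Wx + b).
    z = np.dot(W, x) + b
    # Activation function: introduces non-linearity (ReLU here).
    return np.maximum(0.0, z)

# Made-up example values: 3 inputs feeding one neuron.
x = np.array([0.5, -1.0, 2.0])   # inputs
W = np.array([0.2, 0.8, -0.5])   # one weight per input
b = 0.1                          # bias
print(neuron(x, W, b))           # single output value, ready for the next layer
```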
ReLU
Most commonly used activation function. Rectified Linear Unit => ReLU(x) = max(0, x)
SoftMax
Another popular activation function for classification: it converts a layer's raw outputs into probability scores that sum to 1. For two classes it reduces to the sigmoid, the familiar "S curve" or logit curve.
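A small sketch of softmax over made-up scores, showing that the outputs are probabilities that sum to 1:

```python
import numpy as np

def softmax(scores):
    # Subtract the max for numerical stability, then exponentiate and normalize.
    exps = np.exp(scores - np.max(scores))
    return exps / exps.sum()

# Made-up raw scores (logits) for a 3-class classification problem.
logits = np.array([2.0, 1.0, 0.1])
probs = softmax(logits)
print(probs)        # roughly [0.659 0.242 0.099]
print(probs.sum())  # 1.0
```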
Gradient
All activation functions have a gradient. In order to train, the network uses these gradients to update its weights (gradient descent via backpropagation), so activation functions need to be differentiable.
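A minimal sketch of computing a gradient in TF2.0 with tf.GradientTape; the variable and loss are made-up, but this is the mechanism training uses to get gradients of a loss with respect to the weights:

```python
import tensorflow as tf

w = tf.Variable(3.0)

# Record operations on the "tape" so TensorFlow can differentiate through them.
with tf.GradientTape() as tape:
    loss = w * w          # made-up loss: w^2

grad = tape.gradient(loss, w)
print(grad)               # tf.Tensor(6.0, ...) since d(w^2)/dw = 2w = 6
```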
Tensor
A tensor is a data structure that represents multi-dimensional arrays of numerical values.
Here’s a breakdown:
1. Tensor Definition
- A tensor is a generalization of scalars (single values), vectors (1D arrays), and matrices (2D arrays) to higher dimensions.
- Tensors are used to store data (like the input, weights, or outputs of a neural network) and are fundamental to how data is represented and processed in machine learning frameworks.
2. Activation Function
- An activation function is a mathematical function applied to a neuron’s output to introduce non-linearities into the model, helping it learn complex patterns.
- Common activation functions include ReLU, Sigmoid, and Tanh.
3. Neuron
- A neuron is a basic unit in a neural network, loosely inspired by the biological neurons in our brain. It takes in input, applies weights and biases, and passes the result through an activation function.
- Neurons are organized into layers and are the building blocks of neural networks.
Tensors in Context
In neural networks, tensors are used to hold the inputs, weights, and outputs. For instance:
- Input Tensors hold the input data (like images or text data).
- Weight Tensors store weights that get updated during training.
- Output Tensors hold the results from each layer or neuron after applying weights and activation functions.
A tensor in TensorFlow can be thought of as a matrix, especially when it has two dimensions (rows and columns). However, tensors are more general than matrices, as they can represent data in any number of dimensions (see the sketch after this list):
- 0D tensor: A single number, also called a scalar. For example, 5 is a 0D tensor.
- 1D tensor: A vector or a 1D array, like [5, 10, 15].
- 2D tensor: A matrix, like [[5, 10], [15, 20]], with rows and columns.
- 3D tensor: An array of matrices, often visualized as a "cube" of data. For instance, an RGB image can be represented by a 3D tensor with height, width, and color channels.
- ND tensor: Higher-dimensional structures (4D, 5D, etc.), which can represent even more complex data structures, like batches of images over time.
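A short sketch of these ranks as TensorFlow constants; the values mirror the examples above, and the image sizes are arbitrary:

```python
import tensorflow as tf

scalar = tf.constant(5)                    # 0D tensor (rank 0)
vector = tf.constant([5, 10, 15])          # 1D tensor (rank 1)
matrix = tf.constant([[5, 10], [15, 20]])  # 2D tensor (rank 2)
cube   = tf.zeros([32, 32, 3])             # 3D tensor, e.g. a 32x32 RGB image
batch  = tf.zeros([8, 32, 32, 3])          # 4D tensor, e.g. a batch of 8 images

for t in (scalar, vector, matrix, cube, batch):
    print(t.shape)   # (), (3,), (2, 2), (32, 32, 3), (8, 32, 32, 3)
```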
Computation Graphs
Types of computation graphs
- Static = lazy execution – symbolic programming of neural networks
- Dynamic = eager execution – imperative programming of neural networks
TensorFlow 2.0 supports both graph types
Best practice = develop using dynamic graphs, deploy using static graphs
Symbolic Programming
- First define ops, then execute
- Define functions abstractly, no actual computation
- computation explicitly compiled before evaluation
- e.g. Java / C++
Imperative Programming
- Execution is performed as operations are defined
- Code is actually executed as the function is defined
- No explicit compilation step
- e.g. Python (no compile step)
Symbolic Computation Graph
- First define computation, then run
- Computation first defined using placeholders
- Computation explicitly compiled before execution
- Static – define, then run
- TF1.0
Imperative Graphs
- Computations run as they are defined
- Computation directly performed
- No explicit compilation step
- Dynamic – define-by-run (see the sketch after this list)
- PyTorch
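A minimal sketch of both styles in TF2.0: the same made-up function runs eagerly (imperative) as plain Python, and is traced into a static graph (symbolic) when wrapped with tf.function:

```python
import tensorflow as tf

def affine(x, w, b):
    return tf.matmul(x, w) + b

x = tf.ones([1, 3])
w = tf.ones([3, 2])
b = tf.zeros([2])

# Imperative / eager: the computation runs as soon as the line executes.
print(affine(x, w, b))

# Symbolic / graph: tf.function traces the Python code into a static graph,
# which can then be optimized and deployed.
graph_affine = tf.function(affine)
print(graph_affine(x, w, b))
```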
Sequential API using Keras Layers
Core data structures
- layers
- models
Layers come together to create models
Keras Building Blocks
- Sequential Models
- Functional APIs
- Model Subclassing
- Custom Layers
Sequential Models
Consists of a simple stack of layers, so it cannot be used to build complex model topologies. It is simply a linear stack of layers.
Steps (see the sketch after this list):
- Instantiate Model
- Import, instantiate
- Shape of First Layer
- input_shape, input_dim, input_length
- Shape of remaining layers are inferred
- Add Layers
- Use DNN, RNN, CNN layers, etc.
- Compile Model
- model.compile()
- Train Model
- Epochs, batch size, training data
- model.fit()
- Use Model
- Using test data
- model.predict()
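A minimal end-to-end sketch of these steps with tf.keras; the layer sizes, random data, epochs, and batch size are arbitrary examples:

```python
import numpy as np
import tensorflow as tf

# Instantiate the model and add layers; only the first layer needs input_shape,
# the shapes of the remaining layers are inferred.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation='relu', input_shape=(10,)),
    tf.keras.layers.Dense(8, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid'),
])

# Compile the model: ties it to the TF backend; optimizer and loss are required.
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# Train on made-up data, specifying epochs and batch size.
X_train = np.random.rand(100, 10)
y_train = np.random.randint(0, 2, size=(100, 1))
model.fit(X_train, y_train, epochs=5, batch_size=16)

# Use the model on (made-up) test data.
X_test = np.random.rand(5, 10)
print(model.predict(X_test))
```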
Model Compilation
- model.compile() = ties model to TF backend
- Must specify optimizer and loss function
- Several other optional arguments, e.g. metrics (see the sketch after this list)
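A short sketch of model.compile() with an explicit optimizer object and the optional metrics argument; the model and values are illustrative:

```python
import tensorflow as tf

model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(4,))])

# Required: optimizer and loss. Optional: metrics, among other arguments.
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
    loss='mse',
    metrics=['mae'],
)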